Smart Paper Notes

Improving Multi-fidelity Optimization with a Recurring Learning Rate for Hyperparameter Tuning

HyunJae Lee, Gihyeon Lee, Junhwan Kim, Sungjun Cho, Dohyun Kim, Donggeun Yoo

Categories: Computer Vision | Machine Learning

2022-09-26

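Despite the evolution of convolutional neural networks (CNNs), their performance depends surprisingly strongly on the choice of hyperparameters. However, efficiently exploring a large hyperparameter search space remains challenging because of the long training times of modern CNNs. Multi-fidelity optimization makes it possible to explore more hyperparameter configurations by terminating unpromising configurations early. However, it often selects a sub-optimal configuration, because training typically converges slowly in the early phase. In this paper, we propose Multi-fidelity Optimization with a Recurring Learning rate (MORL), which incorporates the optimization process of CNNs into multi-fidelity optimization. MORL alleviates the slow-starter problem and achieves a more precise low-fidelity approximation. Our comprehensive experiments on general image classification, transfer learning, and semi-supervised learning demonstrate the effectiveness of MORL over other multi-fidelity optimization methods such as the Successive Halving Algorithm (SHA) and Hyperband. Furthermore, it achieves significant performance improvements over hand-tuned hyperparameter configurations within a practical budget.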
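For readers unfamiliar with the baseline named in this abstract, the following is a minimal sketch of plain successive halving, not of MORL itself. The scoring function `toy_score`, the reduction factor `eta`, and the starting budget `min_budget` are illustrative assumptions that do not come from the paper.

```python
import math


def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the top 1/eta of the configurations at each rung and grow the budget by eta."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Low-fidelity evaluation: score every survivor at the current budget.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[:max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]


def toy_score(lr, budget):
    # Hypothetical stand-in for validation accuracy: best at lr = 0.1,
    # and slightly better with a larger training budget.
    return -abs(math.log10(lr) + 1.0) + 0.01 * budget


candidates = [10 ** e for e in (-4, -3, -2, -1, 0)]
best = successive_halving(candidates, toy_score)
```

Because survivors are chosen from short, low-budget runs, a configuration that only becomes strong late in training can be discarded at an early rung, which is the slow-starter problem the abstract refers to.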


Simple Questions Generate Named Entity Recognition Datasets

Hyunjae Kim, Jaehyo Yoo, Seunghyun Yoon, Jinhyuk Lee, Jaewoo Kang

Categories: Natural Language Processing

2021-12-16

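Named entity recognition (NER) is the task of extracting named entities of specific types from text. Current NER models often rely on human-annotated datasets, which require extensive expert knowledge of the target domain and entities. This work introduces an ask-to-generate approach, which automatically generates NER datasets by asking an open-domain question answering system simple natural language questions that reflect the needs for entity types (e.g., "Which disease?"). Without using any in-domain resources (i.e., training sentences, labels, or in-domain dictionaries), our models trained solely on our generated datasets largely outperform previous weakly supervised models on six NER benchmarks across four different domains. Surprisingly, on NCBI-disease, our model achieves a 75.5 F1 score and outperforms by 4.1 F1 the previous best weakly supervised model, which uses a rich in-domain dictionary provided by domain experts. Formulating the needs of NER in natural language also allows us to build NER models for fine-grained entity types such as Award, where our model even outperforms fully supervised models. On three few-shot NER benchmarks, our model achieves new state-of-the-art performance.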


Improving Tagging Consistency and Entity Coverage for Chemical Identification in Full-text Articles

Hyunjae Kim, Mujeen Sung, Wonjin Yoon, Sungjoon Park, Jaewoo Kang

Categories: Natural Language Processing

2021-11-20

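This paper is a technical report on the system we submitted to the chemical identification task of the BioCreative VII Track 2 challenge. The main feature of this challenge is that the data consist of full-text articles, whereas current datasets usually consist of only titles and abstracts. To address the problem effectively, we aim to improve tagging consistency and entity coverage using various methods, such as majority voting within the same article for named entity recognition (NER) and a hybrid approach that combines a dictionary and a neural model for normalization. In experiments on the NLM-Chem dataset, we show that our methods improve the model's performance, particularly in terms of recall. Finally, in the official evaluation of the challenge, our system ranked first, significantly outperforming the baseline model and more than 80 submissions from 16 teams.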
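The article-level majority voting mentioned in this abstract can be pictured with a minimal sketch. The data layout (a list of surface-text and label pairs per article), the case-insensitive matching, and the label names `Chemical` and `O` are assumptions made for illustration, not details taken from the report.

```python
from collections import Counter, defaultdict


def majority_vote(mentions):
    """Relabel every mention with the most frequent label of its surface form.

    mentions: list of (surface_text, predicted_label) pairs from ONE article.
    """
    votes = defaultdict(Counter)
    for text, label in mentions:
        # Assumption: surface forms are matched case-insensitively.
        votes[text.lower()][label] += 1
    winner = {text: counts.most_common(1)[0][0] for text, counts in votes.items()}
    return [(text, winner[text.lower()]) for text, _ in mentions]


article = [
    ("aspirin", "Chemical"),
    ("Aspirin", "O"),  # inconsistent prediction for the same chemical
    ("aspirin", "Chemical"),
    ("water", "O"),
]
relabeled = majority_vote(article)
```

In this toy article the missed mention of "Aspirin" is recovered from the other two predictions, which is consistent with the recall gains the abstract reports.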


PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping

Junhyeok Lee, Seungu Han, Hyunjae Cho, Wonbin Jung

Categories: Artificial Intelligence

2022-11-08

Previous generative adversarial network (GAN)-based neural vocoders are trained to reconstruct the exact ground truth waveform from the paired mel-spectrogram and do not consider the one-to-many relationship of speech synthesis. This conventional training causes overfitting for both the discriminators and the generator, leading to the periodicity artifacts in the generated audio signal. In this work, we present PhaseAug, the first differentiable augmentation for speech synthesis that rotates the phase of each frequency bin to simulate one-to-many mapping. With our proposed method, we outperform baselines without any architecture modification. Code and audio samples will be available at https://github.com/mindslab-ai/phaseaug.
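As a rough illustration of what rotating the phase of each frequency bin does to a waveform, here is a small standard-library sketch. It uses a naive whole-signal DFT and random per-bin angles; it is not differentiable and is not the PhaseAug implementation, and the helper names, the toy signal, and the uniform angle range are assumptions.

```python
import cmath
import math
import random


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def rotate_phases(x, rng):
    """Rotate each frequency bin by a random angle and return a real waveform."""
    n = len(x)
    phi = [0.0] * n  # the DC bin (and the Nyquist bin for even n) stays unrotated
    for k in range(1, (n + 1) // 2):
        phi[k] = rng.uniform(-math.pi, math.pi)
        phi[n - k] = -phi[k]  # opposite angle on the mirrored bin keeps the output real
    rotated = [s * cmath.exp(1j * p) for s, p in zip(dft(x), phi)]
    return [v.real for v in idft(rotated)]


signal = [math.sin(2 * math.pi * t / 8) + 0.5 * math.cos(2 * math.pi * 3 * t / 8)
          for t in range(8)]
augmented = rotate_phases(signal, random.Random(0))
original_mag = [abs(v) for v in dft(signal)]
augmented_mag = [abs(v) for v in dft(augmented)]
```

The augmented waveform differs sample by sample from the original while its magnitude spectrum is unchanged, which is one way to picture several valid waveforms sharing the same spectral magnitudes.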


Enhancing Semantic Understanding with Self-supervised Methods for Abstractive Dialogue Summarization

Hyunjae Lee, Jaewoong Yun, Hyunjin Choi, Seongho Joe, Youngjune L. Gwon

Categories: Natural Language Processing | Artificial Intelligence

2022-09-01

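Contextualized word embeddings can lead to state-of-the-art performance in natural language understanding. Recently, pre-trained deep contextualized text encoders such as BERT have shown their potential for improving natural language tasks, including abstractive summarization. Existing approaches to dialogue summarization focus on incorporating into the summarization task a large language model trained on large-scale corpora that consist of news articles rather than dialogues between multiple speakers. In this paper, we introduce self-supervised methods to compensate for these shortcomings when training a dialogue summarization model. Our principle is to detect incoherent information flows using pretext dialogue text, in order to enhance BERT's ability to contextualize dialogue text representations. Using the enhanced BERT, we build and fine-tune an abstractive dialogue summarization model on a shared encoder-decoder architecture. We empirically evaluate our abstractive dialogue summarizer on the SAMSum corpus, a recently introduced dataset with abstractive dialogue summaries. All of our methods contribute improvements to abstractive summarization as measured by ROUGE scores. Through an extensive ablation study, we also present a sensitivity analysis of critical model hyperparameters: the probabilities of switching utterances and masking interlocutors.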



SANE-TTS: Stable And Natural End-to-End Multilingual Text-to-Speech

Hyunjae Cho, Wonbin Jung, Junhyeok Lee, Sang Hoon Woo

Categories: Machine Learning

2022-06-24

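In this paper, we present SANE-TTS, a stable and natural end-to-end multilingual TTS model. Because it is difficult to obtain a multilingual corpus for a given speaker, training a multilingual TTS model on monolingual corpora is unavoidable. We introduce a speaker regularization loss that improves speech naturalness during cross-lingual synthesis, together with domain adversarial training, which is applied in other multilingual TTS models. Furthermore, once the speaker regularization loss is added, replacing the speaker embedding with a zero vector in the duration predictor stabilizes cross-lingual inference. With this replacement, our model generates speech with a moderate rhythm regardless of the source speaker in cross-lingual synthesis. In MOS evaluation, SANE-TTS achieves a naturalness score above 3.80 in both cross-lingual and intralingual synthesis, where the ground-truth score is 3.99. SANE-TTS also maintains speaker similarity close to that of the ground truth, even in cross-lingual inference. Audio samples are available on our web page.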


Class-Continuous Conditional Generative Neural Radiance Field

Jiwook Kim, Minhyeok Lee

Categories: Computer Vision | Artificial Intelligence

2023-01-03

3D-aware image synthesis focuses on preserving spatial consistency in addition to generating high-resolution images with fine details. Recently, Neural Radiance Field (NeRF) has been introduced for synthesizing novel views with low computational cost and superior performance. While several works investigate generative NeRFs and show remarkable achievements, they cannot handle conditional and continuous feature manipulation in the generation procedure. In this work, we introduce a novel model, called Class-Continuous Conditional Generative NeRF ($\text{C}^{3}$G-NeRF), which can synthesize conditionally manipulated, photorealistic, 3D-consistent images by projecting conditional features to the generator and the discriminator. The proposed $\text{C}^{3}$G-NeRF is evaluated on three image datasets: AFHQ, CelebA, and Cars. Our model shows strong 3D consistency with fine details and smooth interpolation in conditional feature manipulation. For instance, $\text{C}^{3}$G-NeRF achieves a Fréchet Inception Distance (FID) of 7.64 in 3D-aware face image synthesis at a $128^{2}$ resolution. Additionally, because $\text{C}^{3}$G-NeRF can synthesize class-conditional images, we provide FIDs of the generated 3D-aware images for each class of the datasets.


A contrastive learning approach for individual re-identification in a wild fish population

Ørjan Langøy Olsen, Tonje Knutsen Sørdalen, Morten Goodwin, Ketil Malde, Kristian Muri Knausgård, Kim Tallaksen Halvorsen

Categories: Computer Vision | Artificial Intelligence | Machine Learning

2023-01-02

In both terrestrial and marine ecology, physical tagging is a frequently used method to study population dynamics and behavior. However, such tagging techniques are increasingly being replaced by individual re-identification using image analysis. This paper introduces a contrastive learning-based model for identifying individuals. The model uses the first parts of the Inception v3 network, supported by a projection head, and we use contrastive learning to find similar or dissimilar image pairs from a collection of uniform photographs. We apply this technique for corkwing wrasse, Symphodus melops, an ecologically and commercially important fish species. Photos are taken during repeated catches of the same individuals from a wild population, where the intervals between individual sightings might range from a few days to several years. Our model achieves a one-shot accuracy of 0.35, a 5-shot accuracy of 0.56, and a 100-shot accuracy of 0.88, on our dataset.


Learning to Maximize Mutual Information for Dynamic Feature Selection

Ian Covert, Wei Qiu, Mingyu Lu, Nayoon Kim, Nathan White, Su-In Lee

Categories: Machine Learning | Machine Learning (Statistics)

2023-01-02

Feature selection helps reduce data acquisition costs in ML, but the standard approach is to train models with static feature subsets. Here, we consider the dynamic feature selection (DFS) problem where a model sequentially queries features based on the presently available information. DFS is often addressed with reinforcement learning (RL), but we explore a simpler approach of greedily selecting features based on their conditional mutual information. This method is theoretically appealing but requires oracle access to the data distribution, so we develop a learning approach based on amortized optimization. The proposed method is shown to recover the greedy policy when trained to optimality and outperforms numerous existing feature selection methods in our experiments, thus validating it as a simple but powerful approach for this problem.
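The greedy rule described in this abstract, picking the feature with the largest conditional mutual information with the response given what has been observed, can be sketched for a tiny discrete problem where the joint distribution is known exactly. This is the oracle policy that the abstract says is impractical in general, not the learned amortized method; the table layout, the function names, and the toy distribution are assumptions.

```python
import math
from collections import defaultdict


def conditional_mi(joint, feature, observed):
    """I(Y; X_feature | observed) for a table {(x_0, ..., x_{d-1}, y): probability}.

    Assumes the observed values have positive probability under the table.
    """
    p_xy = defaultdict(float)
    for outcome, p in joint.items():
        if all(outcome[i] == v for i, v in observed.items()):
            p_xy[(outcome[feature], outcome[-1])] += p
    total = sum(p_xy.values())
    p_x, p_y = defaultdict(float), defaultdict(float)
    for (x, y), p in p_xy.items():
        p_x[x] += p / total
        p_y[y] += p / total
    return sum((p / total) * math.log((p / total) / (p_x[x] * p_y[y]))
               for (x, y), p in p_xy.items() if p > 0)


def greedy_next_feature(joint, num_features, observed):
    """Return the unobserved feature with the highest conditional mutual information."""
    remaining = [i for i in range(num_features) if i not in observed]
    return max(remaining, key=lambda i: conditional_mi(joint, i, observed))


# Toy distribution: x_0 and x_1 are independent fair bits and y copies x_0.
joint = {(x0, x1, x0): 0.25 for x0 in (0, 1) for x1 in (0, 1)}
first = greedy_next_feature(joint, 2, {})
```

Here feature 0 determines the response and feature 1 is pure noise, so the greedy policy queries feature 0 first.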


Design, Modeling, and Evaluation of Separable Tendon-Driven Robotic Manipulator with Long, Passive, Flexible Proximal Section

Christian DeBuys, Florin C. Ghesu, Jagadeesan Jayender, Reza Langari, Young-Ho Kim

Categories: Robotics

2023-01-01

The purpose of this work was to tackle practical issues which arise when using a tendon-driven robotic manipulator with a long, passive, flexible proximal section in medical applications. A separable robot which overcomes difficulties in actuation and sterilization is introduced, in which the body containing the electronics is reusable and the remainder is disposable. A control input which resolves the redundancy in the kinematics and a physical interpretation of this redundancy are provided. The effect of a static change in the proximal section angle on bending angle error was explored under four testing conditions for a sinusoidal input. Bending angle error increased with increasing proximal section angle under all testing conditions, with an average error reduction of 41.48% for re-tension, 4.28% for hysteresis, and 52.35% for re-tension + hysteresis compensation relative to the baseline case. Two major sources of error in tracking the bending angle were identified: time delay from hysteresis and DC offset from the proximal section angle. Examination of these error sources revealed that the simple hysteresis compensation was most effective for removing time delay, and re-tension compensation for removing DC offset, which was the primary source of increasing error. The re-tension compensation was also tested for dynamic changes in the proximal section and reduced error in the final configuration of the tip by 89.14% relative to the baseline case.
